Pattern Recognition and Artificial Intelligence, 2023, Vol. 36, Issue 11: 1009-1018. DOI: 10.16451/j.cnki.issn1003-6059.202311004
Multimodal Fusion-Based Semantic Transmission for Road Object Detection
ZHU Zengle¹, WEI Zhiwei², ZHANG Rongqing³, YANG Liuqing¹
1. Intelligent Transportation Thrust, The Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511455;
2. Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Shanghai 201210;
3. School of Software Engineering, Tongji University, Shanghai 201804

Abstract: In extreme scenarios with long-tail effects, collaborative perception across multiple vehicles and sensors can supply a vehicle with useful sensory information. However, differences among heterogeneous data, together with bandwidth constraints and diverse data formats, make it difficult for vehicles to schedule and process that data in a unified, efficient way. To integrate multi-sensor information from different vehicles under limited communication bandwidth, this paper proposes a Transformer-based semantic communication framework for multimodal fusion object detection. Unlike traditional data transmission schemes, the framework uses self-attention to fuse data from different modalities, with a focus on the semantic correlations and dependencies among them. This helps vehicles exchange information and collaborate under limited communication resources, improving their understanding of complex road conditions. Experiments on the Teledyne FLIR Free ADAS Thermal dataset show that the proposed model performs well on multimodal object detection semantic communication tasks: detection accuracy improves significantly and transmission cost is reduced by half.
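As a rough illustration of the self-attention fusion idea named in the abstract, the sketch below concatenates feature tokens from two modalities and lets every token attend to every other, so each output mixes information across modalities. It is not the authors' architecture: the modality names (RGB and thermal, suggested only by the choice of dataset), the function `self_attention_fuse`, the toy 2-dimensional tokens, and the use of identity projections in place of learned query, key, and value weights are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fuse(rgb_tokens, thermal_tokens):
    """Single-head scaled dot-product self-attention over both modalities.

    Assumption: Q = K = V = the raw tokens (identity projections).
    A real model would learn separate projection matrices.
    """
    tokens = rgb_tokens + thermal_tokens   # joint cross-modal sequence
    d = len(tokens[0])
    scale = math.sqrt(d)
    fused = []
    for q in tokens:
        # Similarity of this token to every token, in either modality.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights = softmax(scores)
        # Output is the attention-weighted average of all tokens.
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        )
    return fused


# Hypothetical toy features: two RGB tokens and one thermal token.
rgb = [[1.0, 0.0], [0.0, 1.0]]
thermal = [[1.0, 1.0]]
fused = self_attention_fuse(rgb, thermal)
```

Each row of `fused` is a weighted average of tokens from both modalities, which is the sense in which attention can capture cross-modal correlations before anything is transmitted.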
Keywords: Road Object Detection; Heterogeneous Data; Semantic Communication; Multimodal Fusion; Self-Attention Mechanism
Received: 11 October 2023     
Chinese Library Classification (CLC) number: TN919.8
Funding: National Key Research and Development Program of China (No. 2022YFB3104200), General Program of National Natural Science Foundation of China (No. 62271351), National Natural Science Foundation of China (No. U23A20339), Guangzhou Municipal Science and Technology Project (No. 2023A03J0011), Major Research Project of Department of Education of Guangdong Province (No. 2023ZDZX1037)
Corresponding author: ZHANG Rongqing, Ph.D., associate professor. His research interests include internet of vehicles, intelligent transportation, multi-agent collaboration and connected intelligence.
About the authors: ZHU Zengle, Ph.D. candidate. His research interests include semantic communication and artificial intelligence. WEI Zhiwei, Ph.D. candidate. His research interests include vehicular fog computing and computational resource allocation. YANG Liuqing, Ph.D., professor. Her research interests include wireless communication networks, multi-agent systems and integrated communication and sensing.
Cite this article:   
ZHU Zengle, WEI Zhiwei, ZHANG Rongqing, et al. Multimodal Fusion-Based Semantic Transmission for Road Object Detection[J]. Pattern Recognition and Artificial Intelligence, 2023, 36(11): 1009-1018.
URL:  
http://manu46.magtech.com.cn/Jweb_prai/EN/10.16451/j.cnki.issn1003-6059.202311004      OR     http://manu46.magtech.com.cn/Jweb_prai/EN/Y2023/V36/I11/1009